We apply topological data analysis (TDA) to speech classification problems and to the introspection of a pretrained speech model, HuBERT. To this end, we introduce a number of topological and algebraic features derived from Transformer attention maps and embeddings. We show that a simple linear classifier built on top of such features outperforms a fine-tuned classification head. In particular, we achieve an improvement of about $9\%$ accuracy and $5\%$ EER on four common datasets; on CREMA-D, the proposed feature set reaches a new state-of-the-art accuracy of $80.155$. We also show that topological features can reveal the functional roles of speech Transformer heads; e.g., we find heads capable of distinguishing between pairs of sample sources (natural/synthetic) or between voices without any downstream fine-tuning. Our results demonstrate that TDA is a promising new approach for speech analysis, especially for tasks that require structural prediction.
We apply methods of topological analysis to attention graphs computed from the attention heads of the BERT model (arXiv:1810.04805v2). Our research shows that a classifier built on basic persistent topological features (namely, Betti numbers) of a trained neural network can achieve classification results on par with conventional classification methods. We demonstrate the relevance of such topological text representations on three text classification benchmarks. To the best of our knowledge, this is the first attempt to analyze the topology of attention-based neural networks, which are widely used in natural language processing.
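As a minimal sketch of the kind of topological feature both abstracts describe, one can threshold an attention map into an undirected graph and compute its Betti numbers, $\beta_0$ (connected components) and $\beta_1$ (independent cycles), across a filtration of thresholds. The function names, the symmetrization step, and the particular threshold grid below are illustrative assumptions, not the papers' exact pipeline:

```python
import numpy as np

def betti_numbers(attn, threshold):
    """Betti numbers (beta_0, beta_1) of the undirected graph obtained by
    keeping edges whose (symmetrized) attention weight exceeds `threshold`.
    For a graph, beta_0 = number of connected components and
    beta_1 = |E| - |V| + beta_0 (the rank of the cycle space)."""
    n = attn.shape[0]
    # Symmetrize the attention map and threshold it into an adjacency matrix.
    adj = np.maximum(attn, attn.T) > threshold
    np.fill_diagonal(adj, False)
    # Count edges and connected components with a small union-find.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x
    edges = 0
    for i in range(n):
        for j in range(i + 1, n):
            if adj[i, j]:
                edges += 1
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[ri] = rj
    beta0 = sum(1 for i in range(n) if find(i) == i)
    beta1 = edges - n + beta0
    return beta0, beta1

def topological_features(attn, thresholds=(0.01, 0.05, 0.1, 0.25, 0.5)):
    """Concatenate Betti numbers over a filtration of thresholds; the
    resulting vector can feed a simple linear classifier."""
    return np.array([b for t in thresholds for b in betti_numbers(attn, t)])
```

A per-head feature vector of this form, collected over layers and heads, is the sort of representation on which a linear classifier (e.g., logistic regression) could be trained in place of a fine-tuned head.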